This book has now progressed out of alpha to version 1.0.
I have accomplished much of what I set out to do,
which is to play my Karaoke files on my computer.
I now have the following working:
-
MP3+G files can be played by vlc with suitable flags
-
Karaoke files in KAR format are played by TiMidity on a CubieBoard 2 with a custom interface built from Xlib, Cairo and Pango
-
Karaoke lyric files with WMA music files are played by mplayer for the music, with a custom Java player for the MIDI lyrics
-
Jack is used as the underlying audio player, as it allows the use of tools such as jack-rack. This can use plugins such as the TAP pitch plugin to shift the music up or down in pitch
If a chapter is labelled "minimal" then there isn't much content there
now, but there may be later.
Contents
Introduction
See below
PART 1: SAMPLED AUDIO
-
Basic concepts of digital sound (last modified v0.9)
- Resources
- Sampled audio
- Sample rate
- Sample format
- Frames
- Pulse-code modulation
- Over and Under Run
- Latency
- Jitter
- Mixing
- Conclusion
-
User level sound tools (minimal) (last modified v0.1)
- Players
- Sound tools
- Recorders, editors, etc
-
Sound codecs and file formats (minimal) (last modified v0.12)
- Overview
- PCM
- WAV
- MP3
- Ogg Vorbis
- WMA
- Matroska
-
Overview of Linux sound architecture (last modified v0.9)
-
ALSA (last modified v0.9)
- Resources
- User space tools
- Programming ALSA
- Mixing audio
- Writing an ALSA device driver
- Conclusion
-
PulseAudio (last modified v0.10)
- Resources
- Starting, stopping and pausing PulseAudio
- User space tools
- Programming with PulseAudio
- Simple API
- Asynchronous API
- Conclusion
-
Jack (last modified v0.18)
- Introduction
- Resources
- Starting Jack
- User tools
- Applications using Jack
- Using a different soundcard
- How can I use multiple soundcards with JACK?
- Mixing audio
- Writing Audio Applications With JACK
- Libraries
- Port information
- Copy input to output
- Delaying audio
- Audacity with Jack
- Play a sine wave
- Saving input to disk
- Interacting with ALSA devices
- Conclusion
-
Session Management (last modified v0.22)
- Resources
- Session management issues
- jack_connect
- LASH
- Jack sessions
- LADISH
- Non-session manager
- Jack session API
- LADISH API
- Non-session management API
-
Java Sound (last modified v0.19)
- Introduction
- Key Java Sound classes
- Information about devices
- Playing audio from a file
- Recording audio to a file
- Play microphone to speaker
- Where does JavaSound get its devices from?
- Summary
-
GStreamer (minimal) (last modified v0.5)
-
libao (last modified v0.9)
-
FFmpeg (last modified v0.9)
- Resources
- FFmpeg command line tools
- libavformat/libavdecode
-
OpenMAX and OpenSL (last modified v0.8)
- Resources
- OpenMAX AL and OpenSL ES
- OpenMAX and GStreamer
- Linux
- Raspberry Pi
- Android
- OpenSL ES examples
- OpenMAX AL example
- OpenMAX IL
- Ogg Vorbis
- Audio decoding
- Conclusion
PART 2: DIGITAL SIGNAL PROCESSING
-
LADSPA (last modified v0.20)
- Resources
- Files
- Introduction
- User level tools
- The type LADSPA_Descriptor
- Loading a plugin
- The amp program
- The analysePlugin client
- A mono amplifier client
- A stereo amplifier with GUI
- Conclusion
PART 3: DIVERSIONS
The chapters in here have nothing to do with sound,
but may be useful in applications which use sound.
-
Displaying video with overlays using Gtk and FFmpeg (last modified v0.14)
- Introduction
- FFmpeg
- Basic Gtk
- Versions of Gtk
- Displaying the video using Gtk
- Overlaying an image on top of an image
- Alpha channel
- Using Cairo to draw on an image
- Drawing text using Pango
- Conclusion
PART 4: MIDI
-
MIDI (minimal) (last modified v0.3)
-
User level MIDI tools (minimal) (last modified v0.13)
- Sound fonts
- TiMidity
- Rosegarden
- GStreamer
- FluidSynth
- Wild MIDI
- Comparison
- Programming with TiMidity
-
JavaSound (last modified v0.3)
- Introduction
- Resources
- Key JavaSound MIDI classes
- Device Information
- Dumping a MIDI file
- Playing a MIDI file
-
ALSA (last modified v0.6)
- Resources
- Introduction
- aconnect
- seqdemo
- aplaymidi
-
GStreamer (minimal) (last modified v0.5)
-
FluidSynth (minimal) (last modified v0.6)
- Resources
- Players
- Play MIDI files
- Python
-
TiMidity (last modified v0.16)
- Files
- Timidity design
- Making TiMidity into a library
- Building a new interface
- Summary
PART 5: KARAOKE
-
Overview (last modified v0.16)
-
User Level Tools (last modified v0.11)
- Karaoke
- Video CD systems
- CD+G disks
- MP3+G files
- Buying CD+G or MP3+G files
- Converting MP3+G to video files
- MPEG 4 files
- Karaoke machines
- MIDI players
- Finding MIDI files
- KAR file format
- pykaraoke
- kmid
- Microphone inputs and reverb effects
- Conclusion
-
Playing MP3+G files (last modified v0.19)
- Files
- Introduction
- File organisation
- Song information
- Song table
- Favourites
- All favourites
- Swing song table
- Playing songs
- VLC
- Playing songs across the network
- Conclusion
-
Decoding the DKD files on the Sonken Karaoke DVD (last modified v0.19)
- Introduction
- Format shifting
- Files on the DVD
- Decoding DTSMUS20.DKD
- The data files
- Decoding MIDI files
- Playing MIDI files
- Playing WMA files
- KAR format
- Playing songs with pykar
- Conclusion
-
JavaSound (last modified v0.4)
- Resources
- Introduction
- KaraokePlayer
- MidiPlayer
- DisplayReceiver
- MidiGUI
- Song information
- AttributedLyricPanel
- PianoPanel
- MelodyPanel
- SequenceInformation
- PinYin
- Karaoke player with sampling
- Comments on device choices
- Performance
- Downloads
- Conclusion
-
Subtitles (last modified v0.12)
- Resources
- Introduction
- Subtitle formats
- MPlayer
- VLC
- GStreamer
- Gnome subtitles
- SubStation Alpha
- Karaoke effects in ASS files
- Multi-line Karaoke
- libass
- Converting KAR files to MKV files with ASS subtitles
- HTML5 subtitles
- Conclusion
-
FluidSynth (last modified v0.15)
- Resources
- Files
- Players
- Play MIDI files
- Extending FluidSynth with callbacks
- Displaying and colouring text with Gtk
- Playing a background video with Gtk
- Conclusion
-
TiMidity (last modified v1.0)
- Files
- Introduction
- TiMidity and Jack
- TiMidity interface
- Adding an interface to TiMidity
- Getting the list of lyrics
- TiMidity options
- Playing lyrics using Pango + Cairo + Xlib
- Playing a background video with Gtk
- Background video with TiMidity as library
- Background video with TiMidity as front-end
- Adding microphone input
- Conclusion
-
Jack (last modified v1.0)
- Using Jack Rack for effects
- Playing MIDI
- TiMidity plus Jack Rack
- Customising TiMidity build
- Playing MP3+G with Jack Rack pitch shifting
- Conclusion
PART 6: STREAMING AUDIO
-
HTTP (minimal) (last modified v0.5)
- HTTP servers
- HTTP clients
- Streaming vs downloading
-
Icecast (minimal) (last modified v0.5)
-
DLNA (minimal) (last modified v0.5)
- Resources
- Introduction
- DLNA open source projects
-
Flumotion (minimal) (last modified v0.11)
PART 7: MISCELLANEOUS
-
Android (last modified v0.7)
- Resources
- Identifying devices
- My Android experience
- Playing files
- Streaming audio
- Recording audio
- Playing audio from the microphone
- MIDI playback
- OpenMAX
- Conclusion
-
Raspberry Pi (last modified v0.11)
- Resources
- Introduction
- No sound
- ALSA
- Sampled audio players
- Sampled audio capture
- MIDI players
- JavaSound
- PulseAudio
- Java MIDI
- OpenMAX
- Conclusion
Introduction
Linux is a major operating system that can not only do what
every other operating system can do, but can also do a lot more.
But because of its size and complexity it can be hard to learn
how to do any particular task. This is reinforced by its development
model: anyone can develop new components, and indeed Linux
relies on a huge army of paid developers and unpaid volunteers to do just that.
But that can lead to confusion: if two methods are developed for
one task, which one should be chosen? Or, more subtly, what are the
distinguishing features of one solution that make it more appropriate
for your problem?
The Linux sound system is a major example of this: there is a large
variety of tools and approaches for almost every aspect of sound.
These range from audio codecs, to audio players, to audio support
both within and outside of the Linux kernel.
I've been using Linux since kernel 0.99. I'm not a kernel hacker,
more of a user and a programmer at the application layer. But I've
always got lost whenever I want to do something complex involving Linux
sound. So I've decided out of sheer bravado to try to describe
the range of solutions to Linux sound issues. Of course, I'm way
out of my depth, but that makes it a challenge, not a hindrance!
I'm going to rely on the people who really know what they
are doing. So rather than re-invent everything from scratch
I'm going to borrow the words of experts whenever I can.
But the responsibility will still be mine, so
please forgive and correct all the errors that I will make...
Acknowledgements
I have used the following in these pages
Copyright © Jan Newmarch, jan@newmarch.name
CHANGES
-
In v0.22
-
Added section on jack_connect to session management
-
Added basic karaoke display interface to TiMidity
-
In v1.0
-
Added chapter on Jack with Karaoke
-
In v0.20
-
Added examples of LADSPA clients to LADSPA chapter
-
In v0.19
-
Added material showing links from JavaSound to underlying sound system
-
Added section to the Sonken disk chapter on playing the files using PyKaraoke
-
Added chapter on playing MP3+G files
-
In v0.18
-
Added examples to Jack Sampled chapter
-
Added chapter on LADSPA
-
Added chapter on (Jack) session management
-
In v0.17
-
Revised chapter on TiMidity Karaoke to include using an interface
-
Expanded the Jack chapter to include the programming
model and sample programs
-
In v0.16
-
Added Overview chapter to Karaoke
-
Added section on building interfaces to TiMidity chapter
-
In v0.15
-
Added chapter on FluidSynth as a Karaoke player
-
Added chapter on TiMidity as a Karaoke player
-
In v0.14
-
Added chapter on Gtk, and new part "Diversions" to hold it
-
In v0.13
-
Added stuff to MIDI user tools
-
Added chapter on programming with TiMidity
-
In v0.12
-
Section on Matroska container format added to Codecs
-
Chapter on Subtitles added to Karaoke section
-
In v0.11
-
New (stub) chapter on Flumotion
(thanks to David Marceau, uticdmarceau2007 at yahoo.ca)
-
Updated Raspberry Pi
-
Updated Karaoke user level
-
In v0.10
-
Many updates to the chapter on PulseAudio to include volume control, mixing
and application clients
-
In v0.9
-
Updated chapter on Architecture
-
Updated chapter on Basic concepts
-
Section on mixing added to ALSA
-
Chapter on libao added
-
Chapter on FFmpeg added
-
In v0.8
-
Added more stuff on ALSA to Raspberry Pi chapter
-
Added more on OpenMAX IL to the OpenMAX chapter
-
Added in the LIM implementation of OpenMAX IL
-
Started changing the program listing colouring to
SHJS JavaScript rather than my old Perl program
(which could only cope with my programming style)
-
In v0.7
-
New chapter on OpenMAX AL, OpenMAX IL and OpenSL ES as used in the
Raspberry Pi and on Android
-
In v0.6
-
Update to TiMidity as a MIDI sequencer server.
-
New chapter on FluidSynth
-
Updated Raspberry Pi
-
Updated Android
-
In v0.5
-
Added chapters on streaming audio, and the beginnings of chapters on Android
and on the Raspberry Pi.
-
In v0.4
-
Chapters on Karaoke added
-
In v0.3
-
Added chapter on MIDI overview - pointers to other stuff for now
-
Added chapter on JavaSound MIDI API
-
Added minimal chapter on MIDI user tools
-
In v0.2
-
ALSA playback of captured audio finally works after hours
figuring out what the undocumented stuff does :-(